Abstract

Generally, when genetic programming (GP) is used for function synthesis any valuable experience gained by the system is 
lost from one problem to the next, even when the problems are closely related. With the aim of developing a system which 
retains beneficial experience from problem to problem, this paper introduces the novel Node-by-Node Growth Solver (NNGS) 
algorithm which features a component, called the controller, which can be adapted and improved for use across a set of 
related problems. NNGS grows a single solution tree from root to leaves. Using semantic backpropagation and acting locally 
on each node in turn, the algorithm employs the controller to assign subsequent child nodes until a fully formed solution is 
generated. 

The aim of this paper is to pave a path towards the use of a neural network as the controller component and also, separately, 
towards the use of meta-GP as a mechanism for improving the controller component. A proof-of-concept controller is 
discussed which demonstrates the success and potential of the NNGS algorithm. In this case, the controller constitutes 
a set of hand written rules which can be used to deterministically and greedily solve standard Boolean function synthesis 
benchmarks. Even before employing machine learning to improve the controller, the algorithm vastly outperforms other well 
known recent algorithms on run times, maintains comparable solution sizes, and has a 100% success rate on all Boolean 
function synthesis benchmarks tested so far. 

1. Motivation

Most genetic programming (GP) [2] systems do not adapt or
improve from solving one problem to the next. Any experience
which could have been gained by the system is usually
completely forgotten when that same system is applied to a
subsequent problem; the system effectively starts from scratch
each time.

For instance, consider the successful application of a classical
GP system to a standard n-bits Boolean function synthesis
benchmark (such as the 6-bits comparator as described
in [3]). The population which produced the solution tree is not
useful for solving any other n-bits Boolean benchmark (such
as the 6-bits multiplexer). Therefore, in general, an entirely
new and different population must be generated and undergo
evolution for each different problem. This occurs because
the system components which adapt to solve the problem (a
population of trees in the case of classical GP) become so
specialised that they are not useful for any other problem.

This paper addresses this issue by introducing the Node-by-Node
Growth Solver (NNGS) algorithm, which features a
component, called the controller, that can be improved from
one problem to the next within a limited class of problems.

NNGS uses Semantic Backpropagation (SB) and the controller
to grow a single S-expression solution tree starting
from the root node. Acting locally at each node, the controller
makes explicit use of the target output data and input arguments
data to determine the properties (i.e. operator type or
argument, and semantics) of the subsequently generated child
nodes.

The proof-of-concept controller discussed in this paper 
constitutes a set of deterministic hand written rules and has 
been tested, as part of the NNGS algorithm, on several popular 
Boolean function synthesis benchmarks. This work aims 
to pave the way towards the use of a neural network as an 
adaptable controller and/or, separately, towards the use of 
meta-GP for improving the controller component. In effect, 
the aim is to exploit the advantages of black-box machine 
learning techniques to generate small and examinable program 
solutions. 

The rest of this paper proceeds as follows: Section 2
outlines other related research. Section 3 details semantic
backpropagation. A high level overview of the NNGS system
is given in Section 4 and Section 5 describes the proof-of-concept
controller. Section 6 details the experiments conducted.
The experimental results and a discussion are given in
Section 7. Section 8 concludes with a description of potential
future work.

2. Related Work 

The isolation of useful subprograms/sub-functions is a related 
research theme in GP. However, in most studies subprograms 
are not reused across different problems. In [2] for instance,
the hierarchical automatic function definition technique was
introduced so as to facilitate the development of useful sub-functions
whilst solving a problem. Machine learning was
employed in [3] to analyse the internal behaviour (semantics)
of GP programs so as to automatically isolate potentially 
useful problem specific subprograms. 

SB was used in [4] to define intermediate subtasks for GP.
Two GP search operators were introduced which semantically
searched a library of subtrees which could be used to solve
the subtasks. Similar work was carried out in j6][2]], however 
in these cases subtree libraries were smaller and static, and 
only a single tree was iteratively modified as opposed to a 
population of trees. 

3. Semantic Backpropagation (SB) 

Semantic backpropagation (SB) within the context of GP is 
an active research topic mmmm. 

Consider the output array produced by the root node of a 
solution tree, where each element within the array corresponds 
to the output from one fitness case. This output array is 
the semantics of the root node. If the solution is perfectly 
correct, the output array will correspond exactly to the target 
output array of the problem at hand. In a programmatic style,
the output array of a general node node_x will be denoted
as node_x.outputs and the output from fitness case i will be
denoted by node_x.outputs[i].

Each node within a tree produces an output array, a feature
which has been exploited in [3] to isolate useful subtrees. The
simplest example of a tree (beyond a single node) is a triplet
of nodes: a parent node node_a, the left child node node_b,
and the right child node node_c.

As a two fitness case example, suppose that a triplet is
composed of a parent node node_a representing the operator
AND, a left child node node_b representing input argument
A1 = [0,1], and a right child node node_c representing input
argument A2 = [1,1]. The output array of the parent node is
given by:

node_a.outputs = node_b.outputs AND node_c.outputs
               = [0,1] AND [1,1]                          (1)
               = [0,1].
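
The forward computation in Eq. (1) can be sketched in a few lines of Python. This is an illustration only: the names OPS and evaluate_triplet are assumptions introduced here and do not come from the original implementation.

```python
# Element-wise Boolean operators from the standard set used in this paper.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
}

def evaluate_triplet(op, left_outputs, right_outputs):
    """Return the parent's output array, one element per fitness case."""
    return [OPS[op](b, c) for b, c in zip(left_outputs, right_outputs)]

A1 = [0, 1]
A2 = [1, 1]
print(evaluate_triplet("AND", A1, A2))  # [0, 1], as in Eq. (1)
```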


On the other hand, given the output array from the parent
node of a triplet node_a, it is possible to backpropagate
the semantics so as to generate output arrays for the child
nodes, if the reverse of the parent operator is defined carefully.
This work will exclusively tackle function synthesis problems
within the Boolean domain, and therefore, the standard [2]
set of Boolean operators will be used: AND, OR, NAND, and
NOR.


Figure 1. Function tables for the reverse operators: AND^{-1},
OR^{-1}, NAND^{-1}, and NOR^{-1}.

Figure 1 gives function tables for the element-wise reverse
operators: AND^{-1}, OR^{-1}, NAND^{-1}, and NOR^{-1} (their use
with 1D arrays as input arguments follows as expected). As
an example use of these operators the previous example will
be worked in reverse: given the output array of node_a, the
arrays node_b.outputs and node_c.outputs are calculated as:

node_b.outputs, node_c.outputs = AND^{-1}(node_a.outputs)
                               = AND^{-1}([0,1])
                               = [0,1], [#,1]             (2)
                                 or
                               = [#,1], [0,1].

The hash symbol # in this case indicates that either 1
or 0 will do. Note that two different possible values for
node_b.outputs and node_c.outputs exist because AND^{-1}(0) =
(0,#) or (#,0). This feature occurs as a result of rows 4
and 5 of the AND^{-1} function table. Note that each of the
other reverse operators has a similar feature, specifically for:
OR^{-1}(1), NAND^{-1}(1), and NOR^{-1}(0).

Note also that, for any array locus i in node_a.outputs where
node_a.outputs[i] = #, it is true that node_b.outputs[i] = #
and node_c.outputs[i] = #. For example, NOR^{-1}([1,#]) =
([0,#],[0,#]).
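
As a rough illustration of the element-wise reverse operators described above, the sketch below derives the permitted (left, right) child values directly from the forward operators rather than from Figure 1, which is not reproduced here. The helper name reverse_element and the string "#" used for the don't-care value are assumptions made for this example.

```python
from itertools import product

HASH = "#"  # don't-care value: either 0 or 1 is acceptable

OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
}

def reverse_element(op, value):
    """All (left, right) patterns the reverse operator allows for one element."""
    if value == HASH:
        return [(HASH, HASH)]  # a '#' in the parent propagates to both children
    exact = [(b, c) for b, c in product((0, 1), repeat=2) if OPS[op](b, c) == value]
    if len(exact) == 1:
        return exact  # e.g. AND^{-1}(1) = (1, 1)
    # Three concrete pairs collapse into two patterns containing a '#',
    # e.g. AND^{-1}(0) = (0, #) or (#, 0).
    forced = next(b for b, c in exact if b == c)
    return [(forced, HASH), (HASH, forced)]

print(reverse_element("AND", 0))                       # [(0, '#'), ('#', 0)]
print([reverse_element("NOR", v) for v in (1, HASH)])  # [[(0, 0)], [('#', '#')]]
```

The two printed results correspond to the AND^{-1}(0) and NOR^{-1}([1,#]) examples in the text.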

Using the reverse operators in this way, output arrays can 
be assigned to the child nodes of any parent node. The child 
output arrays will depend on two decisions: firstly, on the
operator assigned to the parent node, as this is the operator
that is reversed; and secondly, on the choices made (note the
AND^{-1}(0) example above), at each locus, as to which of the
two child output arrays contains the # value. These decisions
are made by the controller component.

Using these reverse operators for SB can only ever produce
a pair of output arrays which are different from the
problem target outputs in two ways. Firstly, the output arrays
can be flipped (using the NOT gate on each bit) or
un-flipped versions of the problem target outputs. Secondly,
some elements of the output arrays will be # elements.

4. Node-by-Node Growth Solver (NNGS) 



Figure 2. A visual representation of the NNGS algorithm 
during the development of a solution tree. 

A visual representation of the NNGS algorithm can be 
seen in Fig. 2, which shows a snapshot of a partially generated
solution tree. This tree, in its unfinished state, is composed
of: AND and OR operators, an input argument labelled A1,
and two unprocessed nodes. The basic role of the NNGS 
algorithm is to manage growing the solution tree by passing 
unprocessed nodes to the controller and substituting back the 
generated/returned node triplet. 

Algorithm 1 gives a simple and more thorough explanation
of NNGS. In line 2 the output values of the solution tree root
node are set to the target output values of the problem at hand.
The output values of a node are used, along with the reverse
operators, by the controller (line 9) to generate the output
values of the returned child nodes. The controller also sets
the node type (they are either operators or input arguments)
of the input parent node and generated child nodes.

Nodes which have been defined by the controller as input
arguments (with labels: A1, A2, A3, ... etc.) cannot have child
nodes (they are by definition leaf nodes) and are therefore
not processed further by the controller (line 6). When every
branch of the tree ends in an input argument node, the
algorithm halts.

Algorithm 1 The Node-by-Node Growth Solver
NNGS(target_outputs, controller)

 1  solution_tree ← {}
 2  root_node.outputs ← target_outputs
 3  unprocessed_nodes ← {root_node}
 4  while len(unprocessed_nodes) > 0 do
 5      node_a ← unprocessed_nodes.pop()
        ▷ check for leaf node
 6      if node_a.type = 'argument' then
 7          solution_tree.insert(node_a)
 8          continue    ▷ move on to next node
 9      node_a, node_b, node_c ← controller(node_a, target_outputs)
10      unprocessed_nodes.insert({node_b, node_c})
11      solution_tree.insert(node_a)
12  return solution_tree

Note that the controller may well generate a triplet where 
one or more of the child nodes require further processing. In 
this case the NNGS algorithm will pass these nodes back to 
the controller at a later stage before the algorithm ends. In 
effect, by using the controller component the NNGS algorithm 
simply writes out the solution tree. 
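
The growth loop of Algorithm 1 can be sketched in Python as follows. The Node class and the toy controller are hypothetical stand-ins introduced purely so that the loop can be run: the toy controller only recognises targets that a single operator applied to two input arguments can produce, and it is not the proof-of-concept controller of Section 5.

```python
from dataclasses import dataclass, field
from itertools import product

OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
}

@dataclass
class Node:
    outputs: list
    type: str = "unprocessed"  # becomes 'operator' or 'argument'
    label: str = ""            # operator name or argument name
    children: list = field(default_factory=list)

def nngs(target_outputs, controller):
    """Grow one solution tree from the root, following Algorithm 1."""
    solution_tree = []
    root_node = Node(outputs=list(target_outputs))
    unprocessed_nodes = [root_node]
    while unprocessed_nodes:
        node_a = unprocessed_nodes.pop()
        if node_a.type == "argument":  # leaf node: nothing left to grow
            solution_tree.append(node_a)
            continue
        node_a, node_b, node_c = controller(node_a, target_outputs)
        unprocessed_nodes.extend([node_b, node_c])
        solution_tree.append(node_a)
    return root_node, solution_tree

def make_toy_controller(arguments):
    """Hypothetical stand-in controller: only solves one-operator targets."""
    def controller(node_a, target_outputs):
        pairs = product(arguments.items(), repeat=2)
        for op, ((name_b, b), (name_c, c)) in product(OPS, pairs):
            if [OPS[op](x, y) for x, y in zip(b, c)] == node_a.outputs:
                node_b = Node(list(b), "argument", name_b)
                node_c = Node(list(c), "argument", name_c)
                node_a.type, node_a.label = "operator", op
                node_a.children = [node_b, node_c]
                return node_a, node_b, node_c
        raise ValueError("toy controller cannot solve this node")
    return controller

arguments = {"A1": [0, 0, 1, 1], "A2": [0, 1, 0, 1]}
root, tree = nngs([0, 1, 1, 1], make_toy_controller(arguments))
print(root.label, [child.label for child in root.children])  # OR ['A1', 'A2']
```

Because the controller is passed in as a parameter, the same loop would accept any other controller that returns a node triplet.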

5. Proof-Of-Concept Controller 

Given an unprocessed node, the controller generates two child 
nodes and their output arrays using one of the four reverse 
operators. It also sets the operator type of the parent node to 
correspond with the chosen reverse operator that is used. 

The ultimate goal of the controller is to assign an input
argument to each generated child node. For example, suppose
that the controller generates a child node with an output
array node_b.outputs = [0,1,1,#] and that an input argument
is given by A1 = [0,1,1,0]. In this case, node_b can
be assigned (can represent) the input argument A1 because
[0,1,1,#] matches [0,1,1,0] at every locus, the # element
being free to take either value. The algorithm halts once each leaf
node has been assigned an input argument.
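
The matching rule used in this example can be written down directly. The sketch below is only an illustration; the function names can_represent and matching_arguments are assumptions introduced here.

```python
HASH = "#"  # don't-care value

def can_represent(outputs, argument):
    """True if every non-# element of outputs equals the argument's element."""
    return len(outputs) == len(argument) and all(
        o == HASH or o == a for o, a in zip(outputs, argument)
    )

def matching_arguments(outputs, arguments):
    """Labels of all input arguments that a node with these outputs can represent."""
    return [label for label, values in arguments.items() if can_represent(outputs, values)]

arguments = {"A1": [0, 1, 1, 0], "A2": [0, 1, 0, 0]}
print(can_represent([0, 1, 1, HASH], arguments["A1"]))  # True
print(matching_arguments([0, 1, 1, HASH], arguments))   # ['A1']
```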

Before giving a detailed description of the proof-of-concept
controller, there are a few important general points to stress.
Firstly, the entire decision making process is deterministic.
Secondly, the decision making process is greedy (the perceived
best move is taken at each opportunity). Thirdly, the
controller does not know the location of the input node within
the solution tree. The controller has a priori knowledge of the
input argument arrays, the operators, and the reverse operators
only. Furthermore, the controller, in its current state,
does not memorise the results of its past decision making.
In this regard, when processing a node, the controller has
knowledge of that node’s output array only. In this way, the
controller acts locally on each node. Multiple instances of the
controller could act in parallel by processing all unprocessed
nodes simultaneously.

5.1 Step-by-step 

This subsection will give a step-by-step run-through of the
procedure undertaken by the proof-of-concept controller.
Figure 3 serves as a diagrammatic aid for each major step.

5.1.1 Step 1 

Given an input (parent) node node_a, and for each reverse
operator in Fig. 1, the first step taken by the controller is
to generate output arrays for each of the child nodes. In
the example given in step 1 of Fig. 3 only the OR reverse
operator is used. The OR^{-1} reverse operator generates # values
in the child output arrays due to the following property:
OR^{-1}(1) = (1,#) or (#,1). In this step, whenever the opportunity
arises (regardless of the reverse operator employed),
all generated # values within the child output arrays will be
placed in the output array of the right child node node_c. For
example, in the case of OR^{-1}(1), (1,#) will be used and not
(#,1).

Note that the reverse operators propagate all # elements
from parent to child nodes. This feature is exemplified in
step 1 of Fig. 3 by the propagation of the # value at locus 4
of node_a.outputs to locus 4 of both node_b.outputs and
node_c.outputs.
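
Step 1 can be sketched as a simple table lookup. The table REVERSE_RIGHT and the function reverse_step1 are names assumed for this illustration, and the input array is a made-up example rather than the one shown in Fig. 3.

```python
HASH = "#"  # don't-care value

# (left, right) child values for each reverse operator, keyed by the parent value.
# Where two alternatives exist, the one with '#' in the right child is used.
REVERSE_RIGHT = {
    "AND":  {1: (1, 1),    0: (0, HASH)},
    "OR":   {1: (1, HASH), 0: (0, 0)},
    "NAND": {1: (0, HASH), 0: (1, 1)},
    "NOR":  {1: (0, 0),    0: (1, HASH)},
}

def reverse_step1(op, parent_outputs):
    """Backpropagate a parent output array, placing generated # values on the right."""
    node_b, node_c = [], []
    for value in parent_outputs:
        # A '#' in the parent propagates unchanged to both children.
        left, right = (HASH, HASH) if value == HASH else REVERSE_RIGHT[op][value]
        node_b.append(left)
        node_c.append(right)
    return node_b, node_c

node_b, node_c = reverse_step1("OR", [1, 0, 1, HASH])
print(node_b)  # [1, 0, 1, '#']
print(node_c)  # ['#', 0, '#', '#']
```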

5.1.2 Step 2 

By this step, the controller has generated four different (in
general) node_b.outputs arrays, one for each reverse operator.
The goal for this step is to compare each of those arrays to
each of the possible input argument arrays (A1, A2, ... etc.). As
an example, in step 2 of Fig. 3 the generated node_b.outputs
array is compared to the A2 input argument array.

Two separate counts are made, one for the number of
erroneous 0 argument values E_0 and one for the number of
erroneous 1 argument values E_1 (highlighted in blue and red
respectively in Fig. 3). Two further counts are made of the
number of erroneous node_b loci, for 0 and 1 input argument
values, which could have been # values (and therefore not
erroneous) had the controller not specified in step 1 that all #
values should be placed in the node_c.outputs array whenever
possible. These last two statistics will be denoted by M_0 and
M_1 for 0 and 1 input argument values respectively. These four
statistics form an error table for each reverse operator-input
argument pair.
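
One possible reading of these four counts is sketched below. It assumes that a locus counts towards E_0 (or E_1) when the argument holds 0 (or 1) and node_b holds the opposite definite value, and that it additionally counts towards M_0 (or M_1) when node_c holds # at that locus. This interpretation, the function name error_table, and the example arrays are assumptions, not details confirmed by the text.

```python
HASH = "#"  # don't-care value

def error_table(node_b, node_c, argument):
    """Count E_0, E_1, M_0 and M_1 for one reverse operator-input argument pair."""
    table = {"E0": 0, "E1": 0, "M0": 0, "M1": 0}
    for b, c, a in zip(node_b, node_c, argument):
        if b == HASH or b == a:
            continue  # this locus already agrees with the argument
        table[f"E{a}"] += 1  # erroneous locus, grouped by the argument's value
        if c == HASH:
            # Assumed reading: the '#' sitting in node_c could have been in node_b.
            table[f"M{a}"] += 1
    return table

# Hypothetical arrays as produced by OR^{-1} on the parent array [1, 1, 0, 1].
node_b = [1, 1, 0, 1]
node_c = [HASH, HASH, 0, HASH]
print(error_table(node_b, node_c, [0, 1, 1, 1]))
# {'E0': 1, 'E1': 1, 'M0': 1, 'M1': 0}
```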
Figure 3. Diagrammatic aid for the proof-of-concept
controller.


5.1.3 Step 3 

In this step, the controller sorts the error tables by a number of
statistics. Note that M_0 - E_0 and M_1 - E_1 are the number of
remaining erroneous 0 argument values and erroneous 1 argument
values respectively if all # values were moved from the
node_c.outputs array to the node_b.outputs array whenever
possible. To simplify matters we note that

    if M_1 - E_1 < M_0 - E_0
        let k = 1, j = 0
    otherwise
        let k = 0, j = 1.

Each error table is ranked by (in order, all increasing):
M_k - E_k, E_k, M_j - E_j, E_j, and the number of # values in
node_c.outputs. In a greedy fashion, the very best error table
(lowest ranked) will be selected for the next step (in Fig. 3 the
OR-A2 error table is selected). Note that the ranked list of
error tables might need to be revisited later from step 5.

5.1.4 Step 4 

The error table selected in step 3 effectively serves as an
instruction which details how the node_b.outputs and
node_c.outputs arrays should be modified. The goal of the
controller is to move the minimum number of # values from
the node_c.outputs array to the node_b.outputs array so as
to satisfy the error count for either 1’s or 0’s in one of the
input arguments. In the example given in Fig. 3, two # values
in node_c.outputs are swapped with 1’s in node_b.outputs.
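
A minimal sketch of this swap is given below, under the assumption that a # is moved only at loci where the argument holds the chosen value k, node_b holds the opposite definite value, and node_c holds #. The function name move_hashes and the example arrays are hypothetical and are not taken from Fig. 3.

```python
HASH = "#"  # don't-care value

def move_hashes(node_b, node_c, argument, k):
    """Swap '#' from node_c into node_b at erroneous loci where the argument equals k."""
    node_b, node_c = list(node_b), list(node_c)  # leave the inputs untouched
    for i, (b, c, a) in enumerate(zip(node_b, node_c, argument)):
        if a == k and b != HASH and b != a and c == HASH:
            node_b[i], node_c[i] = c, b  # the definite value moves to the right child
    return node_b, node_c

# Hypothetical arrays as produced by OR^{-1} on the parent array [1, 1, 0, 1].
node_b = [1, 1, 0, 1]
node_c = [HASH, HASH, 0, HASH]
A2 = [0, 1, 0, 0]
new_b, new_c = move_hashes(node_b, node_c, A2, k=0)
print(new_b)  # ['#', 1, 0, '#']
print(new_c)  # [1, '#', 0, 1]
```

In this made-up case two # values are swapped with 1's, after which new_b agrees with A2 at every non-# locus while each (left, right) pair remains a valid OR^{-1} pattern for the parent array.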

5.1.5 Step 5 

In this step, the algorithm checks that the generated
node_b.outputs and node_c.outputs arrays do not exactly equal
the output array of either the parent node node_a or the
grandparent node node_p (if it exists). If this check fails, the
algorithm reverts back to step 3 and chooses the next best error table.

5.1.6 Step 6 

The final step of the algorithm is to appropriately set the 
operator type of node_a given the final reverse operator that 
was used. In this step the algorithm also checks whether either 
(or both) of the child nodes can represent input arguments 
given their generated output arrays. 

6. Experiments 

The Boolean function synthesis benchmarks solved in this
paper are standard benchmarks within GP research [2] [3]
[6]. They are namely: the comparator 6bits and 8bits (CmpXX),
majority 6bits and 8bits (MajXX), multiplexer 6bits
and 11bits (MuxXX), and even-parity 6bits, 8bits, 9bits, and
10bits (ParXX).

Their definitions are succinctly given in 0: 

“For an v-bit comparator Cmp v, a program is required 
to return true if the v/2 least significant input bits encode a 
number that is smaller than the number represented by the v/2 
most significant bits. In case of the majority Maj v problems, 
true should be returned if more that half of the input variables 
are true. For the multiplexer Mul v, the state of the addressed 
input should be returned (6-bit multiplexer uses two inputs 
to address the remaining four inputs, 11-bit multiplexer uses
three inputs to address the remaining eight inputs). In the
parity Par v problems, true should be returned only for an odd 
number of true inputs.” 
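
The quoted definitions translate fairly directly into code that generates the target output array for each benchmark. The sketch below assumes a particular bit ordering (most significant or address bits first), which the quoted text does not specify; all function names are illustrative.

```python
from itertools import product

def to_int(bits):
    """Interpret a bit tuple as an unsigned integer, first bit most significant."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value

def comparator(bits):
    # Assumed ordering: the first half holds the most significant input bits.
    half = len(bits) // 2
    most_significant, least_significant = bits[:half], bits[half:]
    return int(to_int(least_significant) < to_int(most_significant))

def majority(bits):
    return int(sum(bits) > len(bits) / 2)  # more than half of the inputs are true

def multiplexer(bits, n_address):
    # Assumed ordering: address bits first, followed by the addressed inputs.
    return bits[n_address + to_int(bits[:n_address])]

def parity(bits):
    return sum(bits) % 2  # true only for an odd number of true inputs

def target_outputs(benchmark, v, **kwargs):
    """One target value per fitness case, over all 2^v input combinations."""
    return [benchmark(bits, **kwargs) for bits in product((0, 1), repeat=v)]

cmp06 = target_outputs(comparator, 6)
mux11 = target_outputs(multiplexer, 11, n_address=3)
print(len(cmp06), len(mux11))  # 64 2048
```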

The even-parity benchmark is often reported as the most
difficult benchmark [2].

7. Results and Discussion 

The results are given in Table 1 and show that the NNGS
algorithm finds solutions more quickly than all other algorithms
on all benchmarks, with the exception of the ILTI algorithm
on the Mux11 benchmark. A significant improvement in run
time was found for the Par08 benchmark (5.73 seconds,
against 622 seconds for ILTI).

The solution sizes produced by the NNGS algorithm are
always larger than those found by the BP4A and ILTI algorithms
with the exception of the Cmp06 results. The RDO
scheme and ILTI algorithm both rely on traversing large tree
libraries, which makes dealing with large bit problems very
computationally intensive. As such, these methods do not
scale well in comparison to the NNGS algorithm.

It is clear that NNGS is weakest on the Mux11 benchmark.
In this case a very large solution tree was found which
consisted of 12,373 nodes. The multiplexer benchmark is
significantly different from the other benchmarks in
that only four input arguments are significant to any single
fitness case: the three addressing bits and the addressed bit.
Perhaps this is why the methodology implemented in the
controller produced poor results in this case.

8. Further Work 

There are two possible branches of future research which
stem from this work. The first centres around meta-GP. As a
deterministic set of rules, the proof-of-concept controller is 
eminently suited to be encoded and evolved as part of a meta- 
GP system. The difficulty in this case will be in appropriately 
choosing the set of operators which would be made available 
to the population of controller programs. 

The second avenue of research which stems from this
work involves encoding the current proof-of-concept controller
within the weights of a neural network (NN). This can
be achieved through supervised learning in the first instance by 
producing training sets in the form of node triplets using the 
current controller. A training set would consist of randomly 
generated output arrays and the proof-of-concept controller 
generated child output arrays. In this way, the actual Boolean 
problem solutions do not need to be found before training. 

As part of the task of finding a better controller, the weights
of the NN could be evolved using genetic algorithms (GA),
similar to the method employed by [8]. The fitness of a NN
weight set would correspond to the solution sizes obtained by
the NNGS algorithm when employing the NN as a controller:
the smaller the solutions, the better the weight set fitness.
Using the proof-of-concept controller in this way would ensure
that the GA population would have a reasonably working
individual within the initial population.

A NN may also serve as a reasonable candidate controller
for extending the NNGS algorithm to continuous symbolic
regression problems. In this case, the input arguments of the
problem would also form part of the NN’s input pattern.

Table 1. Results for the NNGS algorithm when tested on the Boolean benchmarks; perfect solutions were obtained for each run.
BP4A columns are the results of the best performing algorithm from [3] (* indicates that not all runs found a perfect solution).
The RDOp column is taken from the best performing (in terms of fitness) scheme in [4] (note that in this case, average success
rates and average run times were not given).

          Run time [seconds]           Program size [nodes]
          NNGS      BP4A     ILTI      NNGS     BP4A    ILTI    RDO
Cmp06     0.06      15       9         99       156     59      185
Cmp08     0.86      220      20        465      242     140     538
Maj06     0.19      36       10        271      280     71      123
Maj08     3.09      2019*    27        1391     563*    236     -
Mux06     0.21      10       9         333      117     47      215
Mux11     226.98    9780     100       12373    303     153     3063
Par06     0.35      233      17        515      356     435     1601
Par08     5.73      3792*    622       2593     581*    1972    -
Par09     25.11     -        5850      5709     -       4066    -
Par10     120.56    -        -         12447    -       -       -